Sub-band basis spectrum model for pitch-synchronous log-spectrum and phase based on approximation of sparse coding
Authors
Abstract
In this paper, we propose a sub-band basis spectrum model, a new spectrum representation based on a linear combination of sub-band basis vectors. We apply sparse coding to pitch-synchronously analyzed log spectra. By approximating the resulting basis, we obtain sub-band basis vectors with 1-cycle sinusoidal shapes that follow the mel scale at lower frequencies and an equally spaced scale at higher frequencies. The parameters of the sub-band basis spectrum model, which represent the log spectrum and the phase spectrum, are calculated by fitting the basis to the spectrum. Since the parameters represent the shape of a spectrum, they can easily be used for voice adaptation, interpolation and conversion. Experimental results show that analysis-synthesis speech produced with the proposed model is close to the original speech, and that there is no significant difference between synthetic speech generated from an analysis-synthesis database and that generated from the original database for unit-fusion-based TTS [1].
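The fitting step described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the raised-cosine form chosen for the "1-cycle sinusoidal" basis, the common 2595/700 mel formula, the 4 kHz crossover between the mel-spaced and equally spaced regions, the number of bands, and the use of ordinary least squares for the fit. The helper names (`band_centers`, `subband_basis`, `fit_log_spectrum`) are hypothetical. Only the log-spectrum fit is shown; the phase spectrum is not covered.

```python
import numpy as np

def hz_to_mel(f):
    # Common mel formula (an assumption; the paper may use another warping).
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def band_centers(n_low, n_high, split_hz, nyquist_hz):
    # Mel-spaced centers from 0 Hz to split_hz, then equally spaced up to Nyquist.
    low = mel_to_hz(np.linspace(0.0, hz_to_mel(split_hz), n_low))
    high = np.linspace(split_hz, nyquist_hz, n_high + 1)[1:]
    return np.concatenate([low, high])

def subband_basis(centers, freqs):
    # Each basis vector rises from the previous center to its own center and
    # falls to the next one, following half-cosine ramps (one full cosine cycle).
    n = len(centers)
    basis = np.zeros((n, len(freqs)))
    for i, c in enumerate(centers):
        if i > 0:
            left = centers[i - 1]
            m = (freqs >= left) & (freqs <= c)
            basis[i, m] = 0.5 - 0.5 * np.cos(np.pi * (freqs[m] - left) / (c - left))
        if i < n - 1:
            right = centers[i + 1]
            m = (freqs >= c) & (freqs <= right)
            basis[i, m] = 0.5 + 0.5 * np.cos(np.pi * (freqs[m] - c) / (right - c))
    return basis

def fit_log_spectrum(basis, log_spec):
    # Least-squares fit of the basis coefficients to one log-spectrum frame.
    coeffs, *_ = np.linalg.lstsq(basis.T, log_spec, rcond=None)
    return coeffs

# Illustrative settings (all hypothetical): 16 kHz audio, 513 frequency bins.
freqs = np.linspace(0.0, 8000.0, 513)
centers = band_centers(n_low=30, n_high=20, split_hz=4000.0, nyquist_hz=8000.0)
basis = subband_basis(centers, freqs)

# A synthetic log spectrum that lies in the span of the basis.
rng = np.random.default_rng(0)
log_spec = basis.T @ rng.normal(size=len(centers))
coeffs = fit_log_spectrum(basis, log_spec)
reconstruction = basis.T @ coeffs
```

With this construction, neighboring basis vectors overlap so that they sum to one at every frequency, which means a flat log spectrum is reproduced by equal coefficients. The coefficient vector then acts as a compact description of spectral shape, which is the property the abstract relies on for adaptation, interpolation and conversion.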
Similar resources
Application of spectrum-volume fractal modeling for detection of mineralized zones
The main goal of this research work was to detect the different Cu mineralized zones in the Sungun porphyry deposit in NW Iran using Spectrum-Volume (S-V) fractal modeling based on the sub-surface data for this deposit. This operation was carried out on an estimated Cu block model based on a Fast Fourier Transformation (FFT) using C++ and MATLAB programming. The S-V log-log plot was gene...
Pitch-synchronous Speech Coding Based on Timbre Vectors
A pitch-synchronous method and system for speech coding using timbre vectors is disclosed. On the encoder side, the speech signal is segmented into pitch-synchronous frames without overlap, then converted into a pitch-synchronous amplitude spectrum using FFT. Using Laguerre functions, the amplitude spectrum is transformed into a timbre vector. Using vector quantization, each timbre vector is conver...
Traffic Scene Analysis using Hierarchical Sparse Topical Coding
Analyzing motion patterns in traffic videos can be exploited directly to generate high-level descriptions of the video contents. Such descriptions may further be employed in different traffic applications such as traffic phase detection and abnormal event detection. One of the most recent and successful unsupervised methods for complex traffic scene analysis is based on topic models. In this pa...
Effect of Nitric acid on Particle Morphology of the Nano-TiO2
Nano-sized titanium dioxide (TiO2) powder was prepared by a new wet chemical route from the precursor titanium(IV) chloride (TiCl4) with isopropoxy alcohol in the presence of nitric acid under ambient conditions. The morphologies, phase compositions and components of the TiO2 nanoparticles were characterized by transmission electron ...
A new approach to modeling excitation in very low-rate speech coding
A new method for two-band approximation of excitation signals in an LPC model, to improve speech naturalness in very low rate coding, is proposed. Based on a simplified model of Multi-Band Excitation, the method accurately determines the degree of periodicity, using the concept of Instantaneous Frequency (IF) estimation in the frequency domain. The harmonic structure in the spectrum of LPC residual,...